Automata on Lempel-Ziv Compressed Strings
Abstract
Using the Lempel-Ziv-78 compression algorithm to compress a string yields a dictionary of substrings, i.e. an edge-labelled tree with an order-compatible enumeration, here called an LZ-trie. Queries about strings translate to queries about LZ-tries and hence can in principle be answered without decompression. We compare notions of automata accepting LZ-tries and consider the relation between acceptable and MSO-definable classes of LZ-tries. It turns out that regular properties of strings can be checked efficiently on compressed strings by LZ-trie automata.
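To make the idea concrete, here is a minimal Python sketch of checking a regular property on an LZ78-compressed string without decompressing it. It is not the LZ-trie automaton construction of the paper, only a standard illustration under stated assumptions: phrases are stored as (parent index, character) pairs, the property is given by a complete DFA with states numbered from 0, and the names `lz78_phrases` and `run_dfa_on_lz78` are hypothetical. For every phrase the sketch stores the state-to-state map that reading the phrase induces, derived from the map of its parent in the trie.

```python
from typing import Dict, List, Tuple

Phrase = Tuple[int, str]  # (index of parent phrase, appended character)


def lz78_phrases(text: str) -> List[Phrase]:
    """Parse text into LZ78 phrases; index 0 denotes the empty root phrase."""
    trie: Dict[Tuple[int, str], int] = {}
    phrases: List[Phrase] = []
    node = 0
    for ch in text:
        nxt = trie.get((node, ch))
        if nxt is not None:
            node = nxt  # keep extending the current match
        else:
            phrases.append((node, ch))  # new trie node under `node`
            trie[(node, ch)] = len(phrases)
            node = 0
    if node != 0:
        # Leftover input equals an existing phrase: emit it with an empty character.
        phrases.append((node, ""))
    return phrases


def run_dfa_on_lz78(
    phrases: List[Phrase],
    delta: Dict[Tuple[int, str], int],
    n_states: int,
    start: int,
) -> int:
    """Return the DFA state reached after reading the text encoded by phrases."""
    # maps[i][q] is the state reached from q after reading phrase i.
    maps: List[Tuple[int, ...]] = [tuple(range(n_states))]  # root: identity map
    state = start
    for parent, ch in phrases:
        base = maps[parent]
        m = base if ch == "" else tuple(delta[(q, ch)] for q in base)
        maps.append(m)
        state = m[state]  # advance by a whole phrase in one step
    return state


# Example DFA (an assumption for illustration): state 0 means an even number of 'a's.
EVEN_A = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}
final_state = run_dfa_on_lz78(lz78_phrases("abababaab"), EVEN_A, 2, 0)
```

In this sketch the work is proportional to the number of phrases times the number of DFA states rather than to the length of the uncompressed string, which is one simple way to see why regular properties can be cheap to check on the compressed form.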
Similar resources
Definability and Compression
A compression algorithm takes a finite structure of a class K as input and produces a finite structure of a different class K’ as output. Given a property P on the class K defined in a logic L, we study the definability of property P on the class K’. We consider two compression schemas on unary ordered structures (words), compression by run-length encoding and the classical Lempel-Ziv. First-orde...
Definability and Compression
A compression algorithm takes a finite structure of a class K as input and produces a finite structure of a different class K' as output. Given a property P on the class K defined in a logic L, we study the definability of property P on the class K'. We consider two compression schemas on unary ordered structures (words), a naive compression and the classical Lempel-Ziv. First-order properties of stri...
Efficient Algorithms for Lempel-Ziv Encoding
We consider several basic problems for texts and show that if the input texts are given by their Lempel-Ziv codes then the problems can be solved deterministically in polynomial time in the case when the original (uncompressed) texts are of exponential size. The growing importance of massively stored information requires new approaches to algorithms for compressed texts without decompressing. D...
Bounded Pushdown Dimension vs Lempel Ziv Information Density
In this paper we introduce a variant of pushdown dimension called bounded pushdown (BPD) dimension, that measures the density of information contained in a sequence, relative to a BPD automaton, i.e. a finite state machine equipped with an extra infinite memory stack, with the additional requirement that every input symbol only allows a bounded number of stack movements. BPD automata are a natur...
Translating the EAH Data Compression Algorithm into Automata Theory
Adaptive codes have been introduced in [5] as a new class of non-standard variable-length codes. These codes associate variable-length codewords to symbols being encoded depending on the previous symbols in the input data string. A new data compression algorithm, called EAH, has been introduced in [7], where we have behaviorally shown that for a large class of input data strings, this algorithm ...